A Similarity-based Approach to Match Elements Across Versions of XML Documents

Authors

  • Fernando Campello
  • Bruno Pinto
  • Gabriel Tessarolli
  • Alessandreia Marta de Oliveira
  • Carlos Roberto Carvalho Oliveira
  • Márcio Tadeu Oliveira Júnior
  • Leonardo Gresta Paulino Murta
  • Vanessa Braganholo
Abstract

XML documents are often used to provide inter-system interoperability. A related problem is that XML documents evolve over time, so identifying and understanding the changes they undergo become crucial. Some diff approaches based on syntactic and semantic analysis of the documents have been developed to address this problem. The strategy is to find data fragments that are identical in both versions of an XML document and match the corresponding elements through the use of context keys. However, depending on how XML documents are managed, there is no guarantee that the values of these keys remain the same across versions. Thus, unlike existing approaches, this paper proposes the use of similarity, rather than key equality, to match corresponding elements across XML versions. It also shows how this can be applied to support both syntactic and semantic XML diff applications.
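
The abstract does not specify the similarity function or the matching procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain string similarity (Python's `difflib.SequenceMatcher`) over each element's flattened content, a greedy best-first pairing, and a threshold of 0.6; the names `element_text`, `similarity`, `match_elements`, and the sample `staff`/`employee` documents are all hypothetical.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def element_text(elem):
    """Flatten an element into one string: tag, sorted attributes, descendant text."""
    parts = [elem.tag]
    parts += [f"{k}={v}" for k, v in sorted(elem.attrib.items())]
    parts += [t.strip() for t in elem.itertext() if t.strip()]
    return " ".join(parts)


def similarity(a, b):
    """Score two elements in [0, 1]; elements with different tags never match."""
    if a.tag != b.tag:
        return 0.0
    return SequenceMatcher(None, element_text(a), element_text(b)).ratio()


def match_elements(old_root, new_root, threshold=0.6):
    """Greedily pair children of the two roots, highest similarity first.

    Returns (old_index, new_index, score) tuples. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    candidates = sorted(
        ((similarity(o, n), i, j)
         for i, o in enumerate(old_root)
         for j, n in enumerate(new_root)),
        reverse=True,
    )
    used_old, used_new, pairs = set(), set(), []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates are even less similar
        if i in used_old or j in used_new:
            continue  # each element is matched at most once
        used_old.add(i)
        used_new.add(j)
        pairs.append((i, j, score))
    return sorted(pairs)


# Hypothetical versions: every <id> (the would-be key) changed and the order flipped.
OLD = ("<staff>"
       "<employee><id>1</id><name>Ana Souza</name><role>Engineer</role></employee>"
       "<employee><id>2</id><name>Rui Lima</name><role>Analyst</role></employee>"
       "</staff>")
NEW = ("<staff>"
       "<employee><id>17</id><name>Rui Lima</name><role>Senior Analyst</role></employee>"
       "<employee><id>18</id><name>Ana Souza</name><role>Engineer</role></employee>"
       "</staff>")

pairs = match_elements(ET.fromstring(OLD), ET.fromstring(NEW))
for old_idx, new_idx, score in pairs:
    print(old_idx, new_idx, round(score, 2))
```

In this sample, key equality on `id` would match nothing, while the similarity scores still pair each old `employee` with its counterpart in the new version. A greedy pairing is the simplest choice here; an optimal assignment over the score matrix would be an alternative when many elements look alike.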

Similar resources

Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity

Given the increasing number of XML documents, organizing them effectively in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...

Semantic and Structure Based XML Similarity: The XS3 Prototype

Due to the ever-increasing web availability of XML-based data, an efficient approach to compare XML documents becomes crucial in information retrieval. Such comparison of XML documents has applications in version control (finding, scoring and browsing changes between different versions of a document), change management and data warehousing (support of temporal queries and index maintenance) [3,...

Identifying Structural Mapping between XML Fragments

With the popularity of XML for representing and exchanging data, the requirement for agreement between parties on a common Schema or DTD has become significant. Getting various parties to agree on a common standard is often a time-consuming and complex problem. This work suggests an approach to identify mapping between XML documents with different schemas. Introduction Identifying mapping between two...

Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica

Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...

Similarity of XML-Schema Elements: A Structural and Information Content Approach

EXtensible Markup Language (XML)-Schemas are the emerging standards for describing and validating semi-structured documents across the Internet, due to the rich set of modeling constructors, types and constraints they provide. Semantic similarity is growing in importance in different settings, such as digital libraries, heterogeneous databases and, in particular, the Semantic Web. The focus of ...

Publication date: 2014